Vector and Matrix Calculus


|                | Scalar Field                   | Vector Field |
| 0th Derivative | \( f \)                        | \(\mathbf{f}\) |
| 1st Derivative | \(\nabla f\) Gradient          | \(J(\mathbf{f})\) Jacobian |
|                |                                | \(\supset \nabla\cdot\mathbf{f}\) Divergence and \(\nabla\times\mathbf{f}\) Curl |
| 2nd Derivative | \(H(f)\) Hessian               | |
|                | \(\supset\nabla^2f\) Laplacian | |

1. Gradient

  • Vector field that represents the rate of change in a space.

1.1. Definition

  • For a morphism \(f\colon X\to Y\), the gradient \(\nabla f\colon X\to Z\) is a linear map such that \[ dy = \langle \nabla f, dx\rangle, \] where \(\langle \cdot, \cdot \rangle\colon Z\times X \to Y\) is a well-defined bilinear map.

1.1.1. Orthogonal Curvilinear Coordinate System

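  • \[ \nabla f = \frac{1}{h_i}\frac{\partial f}{\partial \tilde{x}^i} \tilde{\mathbf{e}}_i \]
  • where summation over \(i\) is implied, \(\tilde{\mathbf{e}}_i\) are the unit basis vectors, and \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]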
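As a sanity check of the formula above, the following sketch evaluates the gradient in polar coordinates, where \(h_r = 1\) and \(h_\theta = r\), and compares it with the Cartesian gradient projected onto the polar unit vectors. The test function \(f = x^2y\) and all helper names are illustrative assumptions, not part of the notes.

```python
import math

def f_cart(x, y):
    """Hypothetical test function f(x, y) = x^2 y."""
    return x * x * y

def f_polar(r, theta):
    """The same scalar field expressed in polar coordinates."""
    return f_cart(r * math.cos(theta), r * math.sin(theta))

def grad_polar(r, theta, eps=1e-6):
    """Components of grad f along (e_r, e_theta), using (1/h_i) df/dx^i."""
    h_r, h_theta = 1.0, r  # polar scale factors
    df_dr = (f_polar(r + eps, theta) - f_polar(r - eps, theta)) / (2 * eps)
    df_dtheta = (f_polar(r, theta + eps) - f_polar(r, theta - eps)) / (2 * eps)
    return df_dr / h_r, df_dtheta / h_theta

def grad_cart_projected(r, theta):
    """Analytic Cartesian gradient (2xy, x^2) projected onto (e_r, e_theta)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    gx, gy = 2 * x * y, x * x
    return gx * c + gy * s, -gx * s + gy * c

from_formula = grad_polar(1.5, 0.7)
from_cartesian = grad_cart_projected(1.5, 0.7)
```

The two tuples should agree up to finite-difference error.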

1.2. Properties

2. Divergence

2.1. Definition

  • The divergence of a vector field \(\mathbf{F}\) is \[ \nabla\cdot \mathbf{F} = \frac{\partial F^i}{\partial x^i}, \] with summation over \(i\) implied.

2.1.1. Orthogonal Curvilinear Coordinate System

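  • \[ \nabla\cdot \mathbf{F} = \frac{1}{\prod_j h_j}\frac{\partial}{\partial \tilde{x}^i}\left(\tilde{F}^i\prod_{j\neq i}h_j\right) \] where \(\tilde{F}^i\) are the components along the unit basis vectors, summation over \(i\) is implied, and \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]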
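A minimal numerical sketch of this formula in polar coordinates (\(h_r = 1\), \(h_\theta = r\)). The field \(\mathbf{F} = (x^2, y^2)\), whose Cartesian divergence is \(2x + 2y\), and the helper names are assumptions chosen for illustration.

```python
import math

def F_cart(x, y):
    """Hypothetical test field (F_x, F_y) = (x^2, y^2); its divergence is 2x + 2y."""
    return x * x, y * y

def F_polar(r, theta):
    """Components of the same field along the unit vectors (e_r, e_theta)."""
    c, s = math.cos(theta), math.sin(theta)
    Fx, Fy = F_cart(r * c, r * s)
    return Fx * c + Fy * s, -Fx * s + Fy * c

def div_polar(r, theta, eps=1e-6):
    """(1 / (h_r h_theta)) * [d(h_theta F_r)/dr + d(h_r F_theta)/dtheta]."""
    d_r = ((r + eps) * F_polar(r + eps, theta)[0]
           - (r - eps) * F_polar(r - eps, theta)[0]) / (2 * eps)
    d_theta = (F_polar(r, theta + eps)[1]
               - F_polar(r, theta - eps)[1]) / (2 * eps)
    return (d_r + d_theta) / r

r0, theta0 = 1.2, 0.4
div_from_formula = div_polar(r0, theta0)
div_cartesian = 2 * r0 * math.cos(theta0) + 2 * r0 * math.sin(theta0)
```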

2.2. Interpretation

  1. The net flux through a unit volume.
  2. The relative rate of change of volume, that is, the rate of change of volume per unit volume, of a region carried along by the flow of the vector field.
    • For a vector field given by a linear transformation:
      • \[ \nabla\cdot(\mathbf{Ax}) = \frac{d}{dt}\ln V \]
    • The infinitesimal transformation generated by the vector field \(\mathbf{F}\) is: \[ \tilde{x}^i = x^i + F^idt \]
      • The Jacobian of the transformation would be: \[ J^i{}_j = \begin{bmatrix} 1 + \partial_{x^1}F^1dt & \partial_{x^2}F^1dt & \cdots & \partial_{x^n}F^1dt \\ \partial_{x^1}F^2dt & 1 + \partial_{x^2}F^2dt & \cdots & \partial_{x^n}F^2dt \\ \vdots & \vdots & \ddots & \vdots \\ \partial_{x^1}F^ndt & \partial_{x^2}F^ndt & \cdots & 1+ \partial_{x^n}F^ndt \\ \end{bmatrix} \]
      • And the determinant is: \[ \det J^i{}_j = 1 + \nabla\cdot \mathbf{F}\,dt + O(dt^2) \]
      • Taking the derivative with respect to \(t\), to leading order: \[ \frac{d}{dt} \det J^i{}_j = \nabla\cdot \mathbf{F} \]
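The determinant expansion above can be checked numerically. This sketch assumes a hypothetical field \(\mathbf{F} = (x^2y,\ \sin x + y)\), builds the Jacobian of the infinitesimal transformation for a small \(dt\), and compares \((\det J - 1)/dt\) with the divergence.

```python
import numpy as np

def grad_F(x, y):
    """Matrix of partial derivatives dF^i/dx^j for F = (x^2 y, sin(x) + y)."""
    return np.array([[2 * x * y, x * x],
                     [np.cos(x), 1.0]])

def volume_growth_rate(x, y, dt=1e-6):
    """(det J - 1) / dt for the infinitesimal map x -> x + F dt."""
    J = np.eye(2) + grad_F(x, y) * dt
    return (np.linalg.det(J) - 1.0) / dt

x0, y0 = 0.8, -0.3
rate = volume_growth_rate(x0, y0)
divergence = np.trace(grad_F(x0, y0))  # 2xy + 1
```

The remaining difference between `rate` and `divergence` is the \(O(dt)\) term coming from \(O(dt^2)\) in the determinant.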

3. Curl

3.1. Generalization

3.1.1. Orthogonal Curvilinear Coordinate System

  • \[ \nabla\times \mathbf{F} = \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1\tilde{\mathbf{e}}_1 & h_2\tilde{\mathbf{e}}_2 & h_3\tilde{\mathbf{e}}_3 \\[.5em] \dfrac{\partial}{\partial \tilde{x}^1} & \dfrac{\partial}{\partial \tilde{x}^2} & \dfrac{\partial}{\partial \tilde{x}^3} \\[1em] h_1\tilde{F}^1 & h_2\tilde{F}^2 & h_3\tilde{F}^3 \\ \end{vmatrix} \]
  • where \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]
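As an illustration, expanding the determinant along \(\tilde{\mathbf{e}}_3\) in cylindrical coordinates, with \((h_r, h_\theta, h_z) = (1, r, 1)\), gives the \(z\)-component of the curl. The sketch below checks it against the Cartesian value for a hypothetical planar field \(\mathbf{F} = (-xy,\ x^2,\ 0)\), whose curl has \(z\)-component \(3x\).

```python
import math

def F_cart(x, y):
    """Hypothetical planar field (F_x, F_y) = (-x y, x^2); (curl F)_z = 3x."""
    return -x * y, x * x

def F_polar(r, theta):
    """Components of the same field along the unit vectors (e_r, e_theta)."""
    c, s = math.cos(theta), math.sin(theta)
    Fx, Fy = F_cart(r * c, r * s)
    return Fx * c + Fy * s, -Fx * s + Fy * c

def curl_z_cylindrical(r, theta, eps=1e-6):
    """(1 / (h_r h_theta)) * [d(h_theta F_theta)/dr - d(h_r F_r)/dtheta]."""
    d_r = ((r + eps) * F_polar(r + eps, theta)[1]
           - (r - eps) * F_polar(r - eps, theta)[1]) / (2 * eps)
    d_theta = (F_polar(r, theta + eps)[0]
               - F_polar(r, theta - eps)[0]) / (2 * eps)
    return (d_r - d_theta) / r

r0, theta0 = 1.3, 0.9
curl_from_formula = curl_z_cylindrical(r0, theta0)
curl_cartesian = 3 * r0 * math.cos(theta0)
```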

3.1.2. General Coordinate System

  • \[ (\nabla \times \mathbf{F} )^k = \frac{1}{\sqrt{g}} \varepsilon^{k\ell m} (\nabla_\ell \mathbf{F})_m \]
  • By the symmetry of the Christoffel symbols in their lower indices, the connection terms cancel under the antisymmetrization: \[ (\nabla \times \mathbf{F} ) = \frac{1}{\sqrt{g}} \mathbf{e}_k\varepsilon^{k\ell m} \partial_\ell F_m \]

3.1.3. Differential Form

  • \[ \nabla\times\mathbf{F} = \left(\star(\mathrm{d}\mathbf{F}^\flat)\right)^\sharp \]
  • where \(\flat\) and \(\sharp\) are the musical isomorphisms: \(\flat\) takes a vector to the corresponding 1-form, and \(\sharp\) takes a 1-form back to a vector.

4. Laplacian

4.1. Definition

  • \[ \nabla^2 f = \nabla\cdot\nabla f \]
  • \(\nabla^2\) is used in physics, and \(\Delta\) is used in mathematics.

4.2. Properties

5. Jacobian

  • Transformation between curvilinear coordinate systems.

5.1. Definition

A Jacobian matrix of a vector field \(\mathbf{f}\) is \[ J^{i}{}_{j}=\frac{\partial f^i}{\partial x^j} \] where \(i\) is the row number and \(j\) is the column number.

It gives the rate of change of the vector field in any direction. Consider the identity: \( \mathrm{d}f^i=J^{i}{}_{j}\mathrm{d}x^j \) or equivalently, \( \mathrm{d}\mathbf{f}=\mathbf{J}\mathrm{d}\mathbf{x} \).

Beware that some people prefer to use the transpose of this Jacobian as their Jacobian.
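A short sketch of the identity \(\mathrm{d}\mathbf{f}=\mathbf{J}\mathrm{d}\mathbf{x}\), assuming the polar-to-Cartesian map as the example field. The helper `numerical_jacobian` is a hypothetical finite-difference routine, with rows indexed by the output component and columns by the input coordinate as in the definition above.

```python
import numpy as np

def polar_to_cartesian(q):
    """Example map f(r, theta) = (r cos(theta), r sin(theta))."""
    r, theta = q
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian: entry (i, j) approximates df^i/dx^j."""
    x = np.asarray(x, dtype=float)
    J = np.zeros((f(x).size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return J

q = np.array([2.0, 0.5])
J = numerical_jacobian(polar_to_cartesian, q)
J_exact = np.array([[np.cos(0.5), -2.0 * np.sin(0.5)],
                    [np.sin(0.5),  2.0 * np.cos(0.5)]])

# First-order change predicted by the Jacobian versus the actual change.
dq = np.array([1e-4, -2e-4])
df_linear = J @ dq
df_actual = polar_to_cartesian(q + dq) - polar_to_cartesian(q)
```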

5.2. Inverse

\[ J^{-1}{}^i{}_j := \frac{\partial x^i}{\partial f^j} \] The inverse matrix can also be written concisely as \[ J^{-1}{}^i{}_j = J_j{}^i. \]

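  • As a mnemonic for the notation: swap the numerator and denominator of each partial derivative and transpose the indices. The entries are those of the matrix inverse, not elementwise reciprocals: in general \(\partial x^i/\partial f^j \neq (\partial f^j/\partial x^i)^{-1}\).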
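To make the distinction between the matrix inverse and elementwise reciprocals concrete, here is a small check on the polar-to-Cartesian map (an assumed example), where \(\partial(r,\theta)/\partial(x,y)\) is known in closed form.

```python
import numpy as np

r, theta = 2.0, 0.5

# J = d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta).
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])

# Closed-form d(r, theta)/d(x, y), written in terms of (r, theta).
J_inv_exact = np.array([[np.cos(theta),       np.sin(theta)],
                        [-np.sin(theta) / r,  np.cos(theta) / r]])

J_inv = np.linalg.inv(J)        # matrix inverse
naive = (1.0 / J).T             # elementwise reciprocal, then transpose
```

Here `J_inv` reproduces the closed form while `naive` does not.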

5.3. Change of Basis

The Jacobian of the coordinate transformation from coordinates \(x^j\) to coordinates \(\tilde{x}^i\) is \[ J^i{}_j=\frac{\partial \tilde{x}^i}{\partial x^j}, \] which transforms the components of a vector: \(\tilde{v}^i = J^i{}_j v^j\).

To transform the basis, the inverse Jacobian is used: \[ \frac{\partial}{\partial \tilde{x}^j}=J_j{}^i\frac{\partial}{\partial x^i} \] or equivalently, \[ \begin{bmatrix}\tilde{\mathbf{e}}_{1}&\tilde{\mathbf{e}}_{2}&\cdots&\tilde{\mathbf{e}}_{n}\end{bmatrix}=\begin{bmatrix}\mathbf{e}_{1}&\mathbf{e}_{2}&\cdots&\mathbf{e}_{n}\end{bmatrix}\mathbf{J}^{-1}. \]

| \(\mathbf{J} : TM \to TN\) | \(TM \to TN\)       | \(TN \to TM\)       |
| Covariant                  | \(\mathbf{J}^{-1}\) | \(\mathbf{J}\)      |
| Contravariant              | \(\mathbf{J}\)      | \(\mathbf{J}^{-1}\) |
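The opposite transformation rules for components and basis vectors can be illustrated with a linear change of coordinates \(\tilde{\mathbf{x}} = \mathbf{J}\mathbf{x}\). The matrix and vector below are arbitrary assumptions; the point is only that the geometric vector is unchanged.

```python
import numpy as np

# Hypothetical linear coordinate change x~ = J x, so J^i_j = d x~^i / d x^j.
J = np.array([[2.0, 1.0],
              [0.0, 1.0]])

E = np.eye(2)                      # columns are the old basis vectors e_1, e_2
E_tilde = E @ np.linalg.inv(J)     # new basis vectors: [e~] = [e] J^{-1}

v = np.array([3.0, -1.0])          # components in the old coordinates
v_tilde = J @ v                    # components transform with J

# The same geometric vector, assembled from either basis.
vector_old = E @ v
vector_new = E_tilde @ v_tilde
```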

5.4. Determinant

The determinant of the Jacobian is the ratio of (signed) volumes after and before the transformation. Its absolute value is therefore the factor by which the measure changes under a change of variables in an integral.
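As a sketch of the determinant acting as the measure factor, the area of the unit disk can be recovered by integrating \(|\det \mathbf{J}|\) over the rectangle \((r, \theta) \in [0, 1]\times[0, 2\pi]\). The midpoint rule and the grid sizes are assumptions made for illustration.

```python
import math

def det_jacobian(r, theta):
    """det of d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    return (math.cos(theta) * r * math.cos(theta)
            - (-r * math.sin(theta)) * math.sin(theta))

def disk_area(radius=1.0, n_r=200, n_theta=200):
    """Midpoint-rule integral of |det J| over [0, radius] x [0, 2 pi]."""
    dr = radius / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_theta):
            theta = (k + 0.5) * dtheta
            total += abs(det_jacobian(r, theta)) * dr * dtheta
    return total

area = disk_area()
```

The result should be close to \(\pi\), since the determinant reduces to \(r\).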

6. Hessian

6.1. Definition

Hessian \(\mathbf{H}\) of a twice-differentiable scalar field \(f\) is: \[ H_{ij} = \frac{\partial^2 f}{\partial x^i\partial x^j}. \]

6.2. Properties

  • Hessian matrix is the transpose of the Jacobian matrix of the gradient.
    • (Figure: hessian.excalidraw)
    • \((\mathrm{d}\mathbf{x})^{\rm T}\mathbf{H}[f]\mathrm{d}\mathbf{x} = (\mathrm{d}\nabla f)^{\rm T}\mathrm{d}\mathbf{x}.\)
    • At a stationary point \(\nabla f = \mathbf{0}\), so to first order \(\mathrm{d}\nabla f = \mathbf{H}\,\mathrm{d}\mathbf{x}\) is the gradient \(\nabla f\) at the displaced point \(\mathbf{x} + \mathrm{d}\mathbf{x}\).
    • Notice that \(\nabla f\) is normal to the level sets of \(f\); once normalized, it is their Gauss map.
  • If \(\mathbf{x}\) is a stationary point and the Hessian is positive-definite at \(\mathbf{x}\), then \(f\) attains an isolated local minimum at \(\mathbf{x}\). By the same token, if the Hessian is negative-definite there, \(f\) attains an isolated local maximum.
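The definiteness test can be sketched numerically by inspecting the eigenvalues of a finite-difference Hessian at a stationary point. The function names, the tolerance, and the three example functions are assumptions for illustration; the test says nothing when some eigenvalue is zero.

```python
import numpy as np

def numerical_hessian(f, x, eps=1e-4):
    """Central-difference approximation of H_ij = d^2 f / dx^i dx^j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return 0.5 * (H + H.T)  # symmetrize away rounding noise

def classify_stationary_point(H, tol=1e-6):
    """Classify a stationary point from the signs of the Hessian eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(H)
    if np.all(eigenvalues > tol):
        return "local minimum"
    if np.all(eigenvalues < -tol):
        return "local maximum"
    if eigenvalues.min() < -tol and eigenvalues.max() > tol:
        return "saddle point"
    return "inconclusive"

def bowl(p):
    return p[0] ** 2 + p[0] * p[1] + p[1] ** 2

def dome(p):
    return -bowl(p)

def saddle(p):
    return p[0] ** 2 - p[1] ** 2

origin = np.zeros(2)  # a stationary point of all three functions
labels = [classify_stationary_point(numerical_hessian(g, origin))
          for g in (bowl, dome, saddle)]
```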

7. Identities

8. Notations

There are two main notational conventions for taking derivatives with respect to a vector or a matrix: the numerator layout and the denominator layout. Each has its own advantages and disadvantages, and some authors even mix the two. It is generally recommended to follow the layout of the textbook at hand.

The numerator layout treats the vector in the numerator as a column vector, and the vector in the denominator as a row vector. For example, for \(\mathbf{y}\in\mathbb{R}^m\) and \(\mathbf{x}\in\mathbb{R}^n\), \[ \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots &\ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \\ \end{bmatrix}, \] which matches the layout of the standard Jacobian.

Similarly, the denominator layout treats the vector in the numerator as a row vector, and the vector in the denominator as a column vector. For example, \[ \frac{\partial f}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \vdots \\ \frac{\partial f}{\partial x_n} \\ \end{bmatrix}, \] which matches the layout of the standard gradient.

A matrix can appear in either the numerator or the denominator, but not both. When a matrix is in the denominator, it is treated as its own transpose (this is the numerator-layout convention). In this matrix calculus notation, tensors of rank higher than 2 are not considered.
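The two layouts differ only by a transpose, which is easiest to see from array shapes. The sketch below assumes a hypothetical linear map \(\mathbf{y} = \mathbf{A}\mathbf{x}\) with \(m = 2\) and \(n = 3\), and a helper that assembles the derivative column by column.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)  # hypothetical map y = A x with m = 2, n = 3

def y(x):
    return A @ x

def jacobian_numerator_layout(f, x, eps=1e-6):
    """Entry (i, j) is dy_i/dx_j, so the result has shape (m, n)."""
    x = np.asarray(x, dtype=float)
    columns = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        columns.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(columns, axis=1)

x0 = np.array([1.0, -2.0, 0.5])
numerator = jacobian_numerator_layout(y, x0)  # shape (2, 3), equals A
denominator = numerator.T                     # shape (3, 2), equals A^T
```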

This notation is just for convenience. See Matrix calculus - Wikipedia for more.

9. Derivative

9.1. Leibniz rule

\[ \frac{d}{dx}(\mathbf{A}\mathbf{B}) = \frac{d\mathbf{A}}{dx}\mathbf{B} + \mathbf{A}\frac{d\mathbf{B}}{dx} \]

9.2. Of Inverse Matrix

\[\frac{d\mathbf{A}^{-1}}{dx}=-\mathbf{A}^{-1}\frac{d\mathbf{A}}{dx}\mathbf{A}^{-1}\]
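Both rules in this section can be spot-checked against finite differences. The matrix-valued functions `A` and `B` below are arbitrary assumptions, chosen so that `A` is invertible at the evaluation point.

```python
import numpy as np

def A(x):
    return np.array([[1.0 + x, x ** 2],
                     [np.sin(x), 2.0]])

def dA(x):
    return np.array([[1.0, 2.0 * x],
                     [np.cos(x), 0.0]])

def B(x):
    return np.array([[np.exp(x), 1.0],
                     [x, x ** 3]])

def dB(x):
    return np.array([[np.exp(x), 0.0],
                     [1.0, 3.0 * x ** 2]])

def central_difference(M, x, eps=1e-6):
    """Entrywise derivative of a matrix-valued function of a scalar."""
    return (M(x + eps) - M(x - eps)) / (2 * eps)

x0 = 0.3

# Leibniz rule: d(AB)/dx = A' B + A B' (the order of the factors matters).
product_numeric = central_difference(lambda x: A(x) @ B(x), x0)
product_rule = dA(x0) @ B(x0) + A(x0) @ dB(x0)

# Derivative of the inverse: d(A^{-1})/dx = -A^{-1} A' A^{-1}.
inverse_numeric = central_difference(lambda x: np.linalg.inv(A(x)), x0)
A_inv = np.linalg.inv(A(x0))
inverse_rule = -A_inv @ dA(x0) @ A_inv
```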

10. Exponential

  • \[ e^{\mathbf{A}} := \sum_{n=0}^\infty \frac{\mathbf{A}^n}{n!}. \]

10.1. Properties

  • \[ \mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} \implies e^{\mathbf{A}}e^\mathbf{B} = e^{\mathbf{A}+\mathbf{B}} \] The converse does not hold in general.
  • \[ e^\mathbf{O} = \mathbf{I},\quad \left(e^\mathbf{A}\right)^{-1} = e^{-\mathbf{A}},\quad \left(e^\mathbf{A}\right)^n = e^{n\mathbf{A}} \]
  • \[ \left(e^{\mathbf{A}}\right)^{\mathrm T} = e^{\mathbf{A}^\mathrm{T}}, \quad \operatorname{det}\left(e^\mathbf{A}\right) = e^{\operatorname{tr}(\mathbf{A})} \]
  • If \(\mathbf{A}\) is diagonalizable, \(\mathbf{A} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^{-1}\), then: \[ e^{\mathbf{A}} = \mathbf{V}e^{\mathbf{\Lambda}}\mathbf{V}^{-1}. \]
  • The solution to the differential equation \[ \mathbf{y}' = \mathbf{A}\mathbf{y} \] with initial condition \(\mathbf{y}(0) = \mathbf{y}_0\) is given by the matrix exponential: \[ \mathbf{y}(t) = e^{\mathbf{A}t}\mathbf{y}_0 \] for any square matrix \(\mathbf{A}\).
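Several of the properties above can be checked with a truncated version of the defining series. `expm_series` is a hypothetical helper that is only adequate for small matrices of modest norm; a library routine such as `scipy.linalg.expm` would be the usual choice in practice. The matrices are arbitrary examples.

```python
import numpy as np

def expm_series(A, terms=30):
    """Truncated Taylor series sum_{n < terms} A^n / n! (a sketch, not robust)."""
    A = np.asarray(A, dtype=float)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n      # now holds A^n / n!
        result = result + term
    return result

A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])
E = expm_series(A)
det_identity_holds = np.isclose(np.linalg.det(E), np.exp(np.trace(A)))
inverse_identity_holds = np.allclose(np.linalg.inv(E), expm_series(-A))

# A non-commuting pair, for which e^P e^Q differs from e^(P + Q).
P = np.array([[0.0, 1.0], [0.0, 0.0]])
Q = np.array([[0.0, 0.0], [1.0, 0.0]])
product = expm_series(P) @ expm_series(Q)
exp_of_sum = expm_series(P + Q)

# y(t) = e^{At} y0 satisfies y' = A y (checked by central differences).
y0 = np.array([1.0, 0.0])
t, eps = 0.8, 1e-6
dy_numeric = (expm_series(A * (t + eps)) @ y0
              - expm_series(A * (t - eps)) @ y0) / (2 * eps)
dy_exact = A @ expm_series(A * t) @ y0
```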

11. Jacobi's Formula

11.1. Formula

  • \[ \frac{d}{dt}\det \mathbf{A}(t) = \operatorname{tr}\left(\operatorname{adj}(\mathbf{A}(t))\frac{d\mathbf{A}(t)}{dt}\right) \] where \(\operatorname{adj}\) is the adjugate matrix.
  • If \(\mathbf{A}\) is invertible, this simplifies to \[ \frac{d}{dt}\det\mathbf{A}(t) = \det(\mathbf{A}(t)) \operatorname{tr}\left(\mathbf{A}^{-1}(t)\frac{d}{dt}\mathbf{A}(t)\right) \]

11.2. Properties

  • This means, with \(\mathbf{C}\) the cofactor matrix of \(\mathbf{A}\):
    • \[ \frac{\partial \det\mathbf{A}}{\partial A_{ij}} = (\operatorname{adj}\mathbf{A})_{ji} = (\mathbf{C})_{ij}, \]
    • \[ d\det(\mathbf{A}) = \operatorname{tr}(\operatorname{adj}(\mathbf{A})\,d\mathbf{A}) = \langle (\operatorname{adj}\mathbf{A})^{\rm T}, d\mathbf{A}\rangle_{\rm F}, \]
    • \[ \nabla \operatorname{det}(\mathbf{A}) = (\operatorname{adj}\mathbf{A})^{\rm T} = \mathbf{C}. \]
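Jacobi's formula and the gradient of the determinant can both be verified by finite differences. The sketch assumes an arbitrary invertible \(3\times 3\) matrix function `A(t)` and computes the adjugate as \(\det(\mathbf{A})\,\mathbf{A}^{-1}\), which is valid only in the invertible case.

```python
import numpy as np

def A(t):
    return np.array([[1.0 + t, t ** 2, 0.5],
                     [np.sin(t), 2.0, t],
                     [0.0, 1.0, 3.0 + t]])

def dA(t):
    return np.array([[1.0, 2.0 * t, 0.0],
                     [np.cos(t), 0.0, 1.0],
                     [0.0, 0.0, 1.0]])

def adjugate(M):
    """adj(M) = det(M) M^{-1}; this shortcut assumes M is invertible."""
    return np.linalg.det(M) * np.linalg.inv(M)

t0, eps = 0.4, 1e-6

# Jacobi's formula: d/dt det A = tr(adj(A) dA/dt).
lhs = (np.linalg.det(A(t0 + eps)) - np.linalg.det(A(t0 - eps))) / (2 * eps)
rhs = np.trace(adjugate(A(t0)) @ dA(t0))

# Gradient of det with respect to each entry versus the cofactor matrix.
M = A(t0)
cofactor = adjugate(M).T
grad = np.zeros_like(M)
for i in range(3):
    for j in range(3):
        step = np.zeros_like(M)
        step[i, j] = eps
        grad[i, j] = (np.linalg.det(M + step)
                      - np.linalg.det(M - step)) / (2 * eps)
```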

12. Reference

Created: 2025-05-25 Sun 02:36